Malware family classification via efficient Huffman features

Abstract

As malware evolves and becomes more complex, researchers strive to develop detection and classification schemes that abstract away from the internal intricacies of binary code and represent malware without the need for architectural knowledge or invasive analysis procedures. Such approaches can reduce the complexities of feature generation and simplify the analysis process. In this paper, we present efficient Huffman features (eHf), a novel compression-based approach to feature construction, based on Huffman encoding, where malware features are represented in a compact format, without the need for intrusive reverse-engineering or dynamic analysis processes. We demonstrate the viability of eHf as a solution for classifying malware into their respective families on a large corpus of 15 k samples, indicative of the current threat landscape. We evaluate eHf against alternatives and show that our method is comparable or superior in accuracy, while exhibiting considerably greater runtime efficiency. Finally, we show that eHf is resilient to reordering obfuscation.
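The abstract does not spell out how eHf features are constructed, so the following is only a minimal sketch of the general idea of deriving a compact feature vector from Huffman encoding. It assumes, purely for illustration, that the symbols are the raw bytes of a sample and that the feature for each byte value is the length of its Huffman code; the function names `huffman_code_lengths` and `huffman_feature_vector` are hypothetical and not taken from the paper.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data: bytes) -> dict[int, int]:
    """Return {byte value: Huffman code length in bits} for the bytes in data."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs one bit.
        return {next(iter(freq)): 1}

    # Heap entries: (weight, tie-breaker, symbols in this subtree).
    # The tie-breaker is the smallest symbol in the subtree, which is unique
    # across disjoint subtrees, so the result is deterministic.
    heap = [(weight, sym, [sym]) for sym, weight in freq.items()]
    heapq.heapify(heap)

    lengths = dict.fromkeys(freq, 0)
    while len(heap) > 1:
        w1, t1, syms1 = heapq.heappop(heap)
        w2, t2, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for sym in syms1 + syms2:
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, min(t1, t2), syms1 + syms2))
    return lengths


def huffman_feature_vector(data: bytes) -> list[int]:
    """Fixed-length (256) vector of code lengths; 0 marks an absent byte value."""
    lengths = huffman_code_lengths(data)
    return [lengths.get(b, 0) for b in range(256)]


if __name__ == "__main__":
    sample = b"aaaabbc"  # stand-in for the bytes of a binary
    vec = huffman_feature_vector(sample)
    print({chr(b): n for b, n in enumerate(vec) if n})
```

Because this sketch depends only on byte frequencies, two inputs with the same bytes in a different order yield the same vector. That is a property of the illustration, not a statement about how the paper achieves resilience to reordering obfuscation.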

Similar resources

Exploring Discriminatory Features for Automated Malware Classification

The ever-growing malware threat in the cyber space calls for techniques that are more effective than widely deployed signature-based detection systems and more scalable than manual reverse engineering by forensic experts. To counter large volumes of malware variants, machine learning techniques have been applied recently for automated malware classification. Despite the successes made from thes...

Efficient Classification of Android Malware in the wild using Robust Static Features

The ubiquitous use of Android smartphones continues to threaten the security and privacy of users’ personal information. Its fast adoption rate makes the smartphone an interesting target for malware authors to deploy new attacks and infect millions of devices. Moreover, the growing number and diversity of malicious applications render conventional defenses ineffective. Thus, there is a need to n...

Malware Detection using Classification of Variable-Length Sequences

In this paper, a novel graph-based method is proposed for classifying variable-length sequences as a form of feature extraction. The proposed method overcomes the problems that traditional graphs have with variable-length data, without fixing the length of sequences, by determining the most frequent instructions and placing the remaining instructions in an “other” set, which improves speed and saves memory. Acco...

Microsoft Malware Classification Challenge

The Microsoft Malware Classification Challenge was announced in 2015 along with a publication of a huge dataset of nearly 0.5 terabytes, consisting of disassembly and bytecode of more than 20K malware samples. Apart from serving in the Kaggle competition, the dataset has become a standard benchmark for research on modeling malware behaviour. To date, the dataset has been cited in more than 50 r...

An efficient decoding technique for Huffman codes

We present a new data structure for Huffman coding in which in addition to sending symbols in order of their appearance in the Huffman tree one needs to send codes of all circular leaf nodes (nodes with two adjacent external nodes), the number of which is always bounded above by half the number of symbols. We decode the text by using the memory efficient data structure proposed by Chen et al. [...

Journal

Journal title: Forensic Science International: Digital Investigation

Year: 2021

ISSN: 2666-2825, 2666-2817

DOI: https://doi.org/10.1016/j.fsidi.2021.301192